Nuclear Science and Techniques 24 (2013) 040403 


A low dead time vernier delay line TDC implemented in an Actel 
flash-based FPGA 


QIN Xi, FENG Changqing, LIU Shubin, ZHANG Deliang*, ZHAO Lei, AN Qi 


1 State Key Laboratory of Particle Detection and Electronics, University of Science and Technology of China, Hefei 230026, China 


2 Department of Modern Physics, University of Science and Technology of China, Hefei 230026, China 


Abstract 


In this paper, a high-precision vernier delay line (VDL) TDC (Time-to-Digital Converter) is implemented in an Actel flash-based Field-Programmable Gate Array (FPGA), the A3PE1500, achieving a resolution of 16.4 ps root mean square (RMS), with an averaged bin size of 42 ps. The TDC has a dead time of about 200 ns and a dynamic range of 655.36 μs. A double delay lines method is employed to cut the dead time in half and improve performance. As the bin size of the TDC depends on temperature, a compensation algorithm is adopted for temperature drift correction, and the TDC shows satisfactory performance over a temperature range from -5°C to +55°C. 


1 Introduction 


TDCs (Time-to-Digital Converters) implemented in FPGAs (Field-Programmable Gate Arrays) are a flexible, low-cost means of measuring time of flight (TOF) in particle physics and plasma experiments. In SRAM-based FPGAs, a resolution of 50 ps and a dead time of 10 ns can be achieved by the time interpolating method, which employs dedicated carry lines as the delay elements. In Actel flash-based FPGAs, a resolution of 540 ps can be obtained by employing buffers as delay elements, and the resolution can be improved to 130 ps by eliminating the buffers from the delay line and using only the routing lines. 

Single-event-effect (SEE) test results have shown that Actel flash-based FPGAs perform better than SRAM-based FPGAs in single-event-upset (SEU) immunity. Flash-based FPGAs can therefore be applied in some space missions to meet radiation-tolerance requirements. However, unlike SRAM-based FPGAs, flash-based FPGAs have no dedicated carry lines, and the shortest delay through a logic block is hundreds of picoseconds. It is therefore difficult to measure time with a resolution of tens of picoseconds in flash-based FPGAs using the time interpolating method. 

The “Vernier Delay Line” (VDL) method is utilized by some Time-to-Digital Converters in Application Specific Integrated Circuits (ASICs). Compared with vernier TDCs using crystal oscillators of different frequencies [14,15], a TDC utilizing vernier tapped delay lines can provide higher resolution, larger dynamic range and shorter dead time. Because the propagation delay of the vernier elements depends on temperature and voltage, voltage control circuits and delay-locked loops (DLLs) are integrated to ensure the stability of the TDC bin size. However, such circuits are not integrated in FPGAs, so FPGA-based vernier delay line TDCs have rarely been reported. 

In this paper, a high-resolution TDC based on both the VDL method and the time interpolating method is implemented in an Actel flash-based FPGA, the A3PE1500. A “double delay lines” method is utilized to cut the dead time in half. A temperature compensation algorithm allows the TDC to operate over a wide temperature range. The architecture design and performance tests are described. 


2 Architecture 


2.1 Vernier time interpolating architecture 


The FPGA core of the A3PE1500 consists of 38400 VersaTiles, each of which can be configured as a three-input logic function, a D flip-flop, or a latch by programming the appropriate flash switch interconnections. A VersaTile exhibits different propagation delays when it is used as different combinatorial cells. In our design, the propagation delay difference between the AND3 and MUX2 cells is used as the VDL element, so that a vernier TDC with an average bin size of less than 50 ps can theoretically be formed. 
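
The vernier principle described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions: the per-cell delays of 700 ps and 658 ps are hypothetical values chosen so that their difference matches a 42 ps bin, and the text does not say which of AND3 or MUX2 forms the slower path. The function name `vernier_code` is likewise illustrative.

```python
def vernier_code(interval_ps, tau_slow_ps=700.0, tau_fast_ps=658.0, n_cells=300):
    """Return the index of the first cell at which the lagging edge has caught up.

    The leading (hit) edge travels through the slow cells and the lagging
    (clock-derived) edge through the fast cells, so the gap between them
    shrinks by (tau_slow_ps - tau_fast_ps) per cell. Delay values are
    hypothetical, not taken from the paper.
    """
    lead = 0.0                 # arrival time of the leading edge at the current cell
    lag = float(interval_ps)   # the lagging edge starts interval_ps later
    for cell in range(n_cells):
        lead += tau_slow_ps
        lag += tau_fast_ps
        if lag <= lead:        # lagging edge has overtaken the leading edge
            return cell
    return None                # interval is longer than the line can cover


if __name__ == "__main__":
    for interval in (0, 100, 10_000, 20_000):
        print(interval, vernier_code(interval))
```

With these assumed delays, each code spans 42 ps of input interval, and an interval longer than the line can resolve returns `None`.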

The VDL block diagram is shown in Fig.1. The architecture is a time-stamp vernier TDC based on a coarse counter and interpolator units formed by the VDL. Simulation results show that the propagation delay difference between the AND3 and MUX2 units leads to a minimum bin size of about 50 ps. The leading signal for the vernier lines is the “hit” pulse itself. The lagging signal is generated by a D flip-flop fed by the hit pulse and the master clock. The time measured by the vernier delay line is thus the interval between the hit event and the next clock rising edge. As the TDC is driven by a master clock with a period of 10 ns, the 16-bit coarse time counter provides a wide measurement range of about 655.36 μs, and the range can be expanded further by increasing the number of counter bits. The encoder unit transforms the fine time information output by the VDL into 9-bit data. The Read-Out FIFO stores the integrated TDC data, including the 16-bit coarse time data, the 9-bit fine time data, and the TDC channel ID. “Read En” is the read enable signal for the coarse counter latches and the encoder unit. 


Fig.1 Block diagram of the “Vernier Delay Line” TDC. 
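
As a rough illustration of how the coarse and fine data could be combined into one time stamp, the sketch below uses the counter width and clock period given in the text. Two things are assumptions rather than statements from the paper: that the latched coarse value identifies the clock edge that produced the lagging signal (so the fine interval is subtracted), and that a uniform bin size is good enough for illustration. The names `dynamic_range_ps` and `timestamp_ps` are hypothetical.

```python
CLOCK_PERIOD_PS = 10_000   # 10 ns master clock period (from the text)
COARSE_BITS = 16           # coarse counter width (from the text)
FINE_BITS = 9              # encoder output width (from the text)


def dynamic_range_ps():
    """Time span covered before the coarse counter wraps around."""
    return (1 << COARSE_BITS) * CLOCK_PERIOD_PS


def timestamp_ps(coarse, fine_code, lsb_ps=42.0):
    """Combine coarse and fine data into a hit time in picoseconds.

    Assumption: `coarse` counts the clock edge that follows the hit, and the
    VDL measures the hit-to-edge interval, so the fine time is subtracted.
    A uniform bin size `lsb_ps` is assumed here for simplicity.
    """
    if not 0 <= coarse < (1 << COARSE_BITS):
        raise ValueError("coarse value out of range")
    if not 0 <= fine_code < (1 << FINE_BITS):
        raise ValueError("fine code out of range")
    return coarse * CLOCK_PERIOD_PS - fine_code * lsb_ps


if __name__ == "__main__":
    print(dynamic_range_ps() / 1e6, "us")   # 655.36 us
    print(timestamp_ps(3, 10), "ps")
```

The first function reproduces the 655.36 μs range quoted in the text: 2^16 counts of 10 ns each.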


Figure 2 is the timing diagram for generating the “Read En” pulse. Once a hit signal arrives, a D flip-flop generates the lagging signal at the next clock rising edge. At the following clock rising edge, the lagging signal is latched by another D flip-flop, which outputs an inverted pulse. The lagging signal and the inverted pulse are both fed to an AND2 gate, which outputs the “Read En” signal. The resulting pulse is one clock period wide and is synchronized with the hit signal. “Read En” can be used as the read enable signal for the coarse counter and, after being delayed by several clock periods, as the read enable for the fine data that the encoder unit outputs from the VDL and as the write enable for the FIFO. 
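
The two-flip-flop arrangement described above behaves like a rising-edge detector. The following cycle-level sketch models it under the assumption that the hit level stays high for at least two clock periods; the function name `read_en_trace` and the sampled-input representation are illustrative only.

```python
def read_en_trace(hit_samples):
    """Cycle-level model of the Read En generator.

    hit_samples[n] is the hit level (0 or 1) seen at the n-th rising clock
    edge. ff1 holds the lagging signal; ff2 holds the lagging signal delayed
    by one clock. Read En is the AND of ff1 and the inverted ff2 output.
    """
    ff1 = ff2 = 0
    trace = []
    for d in hit_samples:
        ff1, ff2 = d, ff1                # both flip-flops update on the same edge
        trace.append(ff1 & (1 - ff2))    # AND2 of lagging signal and inverted pulse
    return trace


if __name__ == "__main__":
    print(read_en_trace([0, 1, 1, 1, 0, 0]))   # one-clock-wide pulse per hit
```

Each hit produces a single pulse that lasts exactly one clock period, regardless of how long the hit level stays high.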

Fig.2 Timing diagram of generating the Read Enable Signal. 


2.2 Double delay lines method 


The dead time of the vernier delay line TDC is the maximum time the lagging signal needs to overtake the leading signal. It depends on the number of delay cells and on the propagation delay of a single cell. In a single vernier delay line covering a 10 ns clock period, there are 300 combinatorial delay cells, and the dead time is close to 200 ns. 

A “double delay lines” method is employed to obtain a shorter dead time. Fig.3 shows the structure of the double delay lines TDC. Because the clocks fed to the two VDLs have the same frequency but inverted phases, the lagging signals for the two delay lines are generated half a clock period apart. Each vernier delay line then covers only half a clock period, so the number of cells required in each line is halved and the dead time decreases to 100 ns. Each hit signal is measured by both delay lines, but only one of the two output codes is selected as the time information. The code selection is based on the fine time data from the two vernier delay lines. The clocks for the double delay lines are generated by two methods: two internal PLL cores of the FPGA with inverted phases, or the rising and falling edges of the master clock. The TDC performance with both clocking methods is shown in the test results. 
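
The dead-time figures in this section can be checked with simple arithmetic. In the sketch below, the per-cell delay of the lagging path is derived from the quoted numbers (300 cells, about 200 ns) and is not a value stated in the text; the function names are hypothetical.

```python
import math


def min_cells(range_ps, bin_ps):
    """Smallest number of vernier cells whose bins span range_ps."""
    return math.ceil(range_ps / bin_ps)


def dead_time_ns(n_cells, cell_delay_ns):
    """Worst-case time for the lagging edge to traverse the whole line."""
    return n_cells * cell_delay_ns


# Per-cell delay implied by the quoted figures; derived here, not from the text.
CELL_DELAY_NS = 200.0 / 300

single = dead_time_ns(300, CELL_DELAY_NS)   # one line covering 10 ns
double = dead_time_ns(150, CELL_DELAY_NS)   # each line covering half a period

if __name__ == "__main__":
    print(min_cells(10_000.0, 42.0))         # cells needed at a 42 ps average bin
    print(round(single, 1), round(double, 1))
```

At a 42 ps average bin, at least 239 cells are needed to span 10 ns, so the 300 cells quoted in the text leave some margin; the reason for that margin is not given in the text.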


Fig.3 Schematic of the double delay lines TDC. 


3 Test results 


In 2012, a prototype with a “vernier delay line” TDC implemented in an Actel flash-based FPGA was designed and tested. Fig.4(a) shows the TDC board, which is read out through a Universal Serial Bus (USB) port. Fig.4(b) shows the block diagram of the test setup. 

Fig.4 Prototype of the Actel FPGA-based vernier delay line TDC 
(a) and the testing block diagram (b). 


In Fig.4(b), test hit signals for the TDCs are output by a pulse generator, and LVDS-level signals for each hit are produced by the discriminators. The time information of the input hits is measured by the Actel FPGA-based TDC, and the output time codes are transmitted to the data-processing computer through the USB port. 


3.1 Bin size and differential non-linearity 


In this design, the TDC bin size equals the propagation delay difference between the AND3 and MUX2 units. The differential non-linearity arises mainly from the master clock skew and the unequal widths of the vernier delay chain cells. DNL (differential non-linearity) is defined as the deviation of the bin size from its ideal LSB (least significant bit) value, and INL (integral non-linearity) is the deviation of the input/output curve from the ideal transfer characteristic, which is the straight line that best fits the curve. To obtain the non-linearity of the TDC, the code-density test method was adopted. Fig.5 shows the code-density test results for the vernier TDC with a single chain. Because the non-linearity repeats every 10 ns, a look-up table (LUT) can be constructed from the INL information to compensate the TDC outputs. 
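
A code-density analysis of this kind can be sketched as follows. The sketch assumes that the test hits are uniformly distributed over the clock period and that `counts` holds one histogram entry per occupied bin. For simplicity the INL is computed as the running sum of the DNL, which is an end-point definition rather than the best-fit-line definition given above. The function name `code_density` is illustrative.

```python
def code_density(counts, period_ps=10_000.0):
    """Estimate bin widths, DNL and INL from a code-density histogram.

    counts[i] is the number of hits recorded in code i. With uniformly
    distributed hits, each bin width is proportional to its count.
    """
    total = sum(counts)
    widths = [period_ps * c / total for c in counts]   # bin widths in ps
    lsb = period_ps / len(counts)                      # average (ideal) bin width
    dnl = [w / lsb - 1.0 for w in widths]              # DNL in LSB
    inl, acc = [], 0.0
    for d in dnl:                                      # running sum of DNL
        acc += d
        inl.append(acc)
    return widths, dnl, inl


if __name__ == "__main__":
    widths, dnl, inl = code_density([100, 300, 200, 200], period_ps=8000.0)
    print(widths, dnl, inl)
```

A bin that collects more hits than average shows up as a positive DNL, which is how an ultra-wide bin would appear in such a test.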


Fig.5 Results of the code-density test of the single chain vernier 
TDC. (a) Bin size information of single chain vernier TDC, (b) 
DNL of single chain vernier TDC and (c) INL of single chain 
vernier TDC. 

Figure 5(a) shows the bin sizes for a single vernier delay chain implemented in the A3PE1500; the average is about 42 ps. The first bin, which is larger than the others, causes the worst non-linearity of the TDC. This large bin results from the D flip-flop that outputs the lagging signals. When the hit signal arrives together with the clock rising edge, an ambiguous state appears. The ambiguous state leads to a slower rising edge of the lagging signal. In the code-density test, the slower lagging signals all accumulate in one bin, generating an ultra-wide bin. Owing to this 135 ps bin, the DNL of the TDC is -1/+2.2 LSB, and the INL is -1.4/+3.7 LSB. 

Fig.6 shows the waveforms of the slower 
lagging signal. 


Fig.6 Waveform of the slower lagging signal. Traces: Clock, Hit (lead signal), Lagging signal. 


The bin size and non-linearity of the vernier TDC using the “double delay lines” method are shown in Fig.7. Because each of the two lines covers slightly more than half a clock period, a hit signal arriving at the clock rising edge is tapped at the start of one chain and at the end of the other. By selecting the correct time information, the ultra-wide bin no longer appears in the TDC channel. The DNL of the vernier double delay lines TDC is in the range of -1 to +0.9 LSB, and the INL is -1 to +3.4 LSB. The INL non-uniformity is caused by the difference in averaged bin size between the two delay lines. 


3.2 Time measurement resolution 


The resolution, one of the most important parameters of a time measurement system, can be obtained by the “cable delay test”. However, the cable length is limited, because long cables attenuate the input signal and slow its leading edge, thus introducing measurement error. Therefore, to measure different time intervals, a dual-channel arbitrary function generator, the Tektronix AFG3252, is employed to obtain the resolution. Because the two channels of the generator output correlated signals with adjustable delays, a wide range of time intervals can be applied in the test. 

Fig.7 Results for code-density test of “Double delay lines” 
vernier TDC. (a) Bin Size of “double delay lines” vernier TDC, 
(b) DNL of “double delay lines” vernier TDC and (c) INL of 
“double delay lines” vernier TDC. 

An INL look-up table is used to correct the INL 
error. The TDC utilizing a single vernier delay line 
exhibits a resolution of more than 50 ps RMS before 
INL compensation, and the resolution is about 20 ps 
RMS after INL compensation. Fig.8 shows the time 
resolution of the vernier delay line TDC. 

Fig.8 Time resolution of vernier delay line TDC. (a) 
Resolution before INL compensation, (b) Resolution after 
compensation of single delay line TDC and (c) Resolution after 
compensation of double delay lines TDC. 
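
One common way to realise such a look-up table is bin-by-bin calibration, in which each raw code is mapped to the centre time of its bin as estimated from a code-density histogram. The text does not describe how its INL table is built, so the sketch below is an assumed approach, and the names `build_lut` and `rms` are hypothetical.

```python
def build_lut(counts, period_ps=10_000.0):
    """Map each raw code to the centre time (ps) of its bin.

    counts[i] is the code-density histogram entry for code i; hits are
    assumed to be uniformly distributed over the clock period.
    """
    total = sum(counts)
    lut, edge = [], 0.0
    for c in counts:
        width = period_ps * c / total
        lut.append(edge + width / 2.0)   # bin centre
        edge += width                    # advance to the next bin edge
    return lut


def rms(values):
    """Root-mean-square deviation of values about their mean."""
    mean = sum(values) / len(values)
    return (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5


if __name__ == "__main__":
    lut = build_lut([100, 300, 200, 200], period_ps=8000.0)
    raw_codes = [1, 1, 2, 1]
    corrected = [lut[code] for code in raw_codes]
    print(lut, rms(corrected))
```

Repeated measurements of a fixed interval, corrected through the table, can then be summarised by `rms` to give a resolution figure.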

Because the double vernier delay lines eliminate the ultra-wide bin from the TDC channel, an improvement of about 3.7 ps in RMS is obtained compared with the single delay line TDC (Fig.8(c)). 

The measurement resolution varies with the time interval between the paired input pulses, with a periodicity of one clock cycle. Fig.9 shows the time resolution for intervals from 0 ns to 20 ns in steps of 1 ns. All RMS curves repeat every 10 ns, which equals the clock period. 

When the measured time interval equals N×T (where T is one clock period), the double delay lines TDC with clocks of inverted phases exhibits an RMS of less than 20 ps after INL compensation; however, the RMS becomes worse for other time intervals. In other words, the TDC provides the best precision when the output time codes of the paired input hits are selected from the same delay line, and the RMS becomes worse when the time codes come from different delay lines. The two clocks are generated by the integrated PLLs of the FPGA, and the accumulated clock noise results in worse resolution. Meanwhile, for the double delay lines TDC employing the rising and falling edges of the master clock, the RMS stays around 20 ps. The performance of the single delay line vernier TDC is similar, but its RMS is a few picoseconds higher. The double vernier delay lines TDC using both clock edges therefore has the best performance in time measurements. 

Fig.9 Time resolutions for various delays. (A) Double delay 
lines TDC using both rising and falling clock edges, (a) TDC 
using single delay line, and (c) Double delay lines TDC using 
clocks of inverted phases. 


3.3 Bin size drifts 


Because the delay time of the vernier elements drifts when the ambient temperature changes, tests of the bin size were performed in a temperature control box (Fig.10). 

Fig.10 Bin size drifts for various temperatures. 

When the ambient temperature changes from -5°C to +55°C, the averaged bin size of the TDC increases from 39.8 ps to 44.5 ps. The cell delay varies linearly with temperature, and the calculated slope is about 0.0807 ps/°C. A functional relation of LSB = 39.6 ps + 0.0807 ps/°C × (T + 5°C) is obtained. In an environment where the temperature changes sharply, the random error increases to hundreds of picoseconds without temperature drift correction, so a compensation mechanism is necessary to correct the tap delay for a given operating temperature. Usually, look-up tables are re-generated when the ambient temperature changes, which costs much more memory space. Our temperature compensation algorithm needs only one look-up table, generated at -5°C. 

The algorithm is performed as follows. First, the time information for each hit is corrected by the -5°C look-up table. Second, since the TDC bin size changes with temperature, the LSB at the operating temperature is calculated, and a ratio is obtained by dividing this calculated LSB by the LSB at -5°C. Finally, the time information is further corrected by multiplying it by the ratio of LSBs obtained in the previous step. 
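
The three steps can be sketched as below, using the fitted relation LSB = 39.6 ps + 0.0807 ps/°C × (T + 5°C) from the text. How the operating temperature is obtained is not described, so `temp_c` is simply an assumed input, and `lut_ref` stands for a hypothetical -5°C look-up table that maps raw codes to times in picoseconds.

```python
LSB_REF_PS = 39.6          # fitted LSB at the reference temperature (from the text)
SLOPE_PS_PER_C = 0.0807    # fitted slope (from the text)
T_REF_C = -5.0             # temperature at which the look-up table is generated


def lsb_ps(temp_c):
    """LSB predicted by the linear fit at temperature temp_c (in degrees C)."""
    return LSB_REF_PS + SLOPE_PS_PER_C * (temp_c - T_REF_C)


def compensate(code, temp_c, lut_ref):
    """Apply the three-step temperature drift compensation to one raw code."""
    t_ref = lut_ref[code]                       # step 1: correct with the -5 C table
    ratio = lsb_ps(temp_c) / lsb_ps(T_REF_C)    # step 2: ratio of LSBs
    return t_ref * ratio                        # step 3: scale the time information


if __name__ == "__main__":
    lut_ref = [0.0, 100.0]                      # hypothetical two-entry table
    print(lsb_ps(55.0))
    print(compensate(1, 55.0, lut_ref))
```

At the reference temperature the ratio is exactly 1, so the output reduces to the plain look-up table correction; only a single table and two fit coefficients need to be stored.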

Figure 11 shows the resolution at different temperatures for a 0 ns time interval. The RMS stays below 18 ps when a look-up table generated at each temperature is used, while the resolution becomes slightly worse when the temperature drift compensation algorithm is used with the -5°C look-up table. The -5°C table is used because more vernier elements are occupied at lower temperatures, owing to the smaller bin sizes, so more bin information is included. The RMS increases slightly once the operating temperature departs from -5°C, and the worst RMS remains below 22 ps. This temperature drift compensation algorithm is thus preferable for its accuracy and convenience. 

Fig.11 Time resolution at different temperatures. (■) Corrected 
by a fixed look-up table and the temperature drift compensation 
algorithm, (▲) Corrected by the look-up tables generated at 
different temperatures. 


4 Discussion 


4.1 Coarse counter 


The 16-bit coarse counter, running from a 100 MHz clock, provides a dynamic range of 655.36 μs. As no dedicated carry lines are integrated in Actel flash-based FPGAs, counters in these devices are formed from combinatorial cells. The carry delay of the counter, which comprises the cell propagation delays and the routing delays, must be confined within one clock cycle. Layout constraints are therefore applied to shorten the routing delays, and all combinatorial cells of the counter are placed in a small region of the FPGA. 


4.2 Dead time 


To suit practical applications, the double delay lines method is employed to decrease the dead time of the vernier delay line TDC. Using both edges of the master clock, the maximum dead time can be decreased to 100 ns. The TDC channel is disabled while a hit is being measured. The dead time also depends on temperature: it becomes about 10% longer when the ambient temperature increases by 50°C. 


4.3 Bin size and logic resource occupancy 


The bin size of the TDC depends on the delay difference between the two macros chosen as the vernier cell. A bin size of less than 40 ps can be obtained by selecting other delay elements. Shorter delay cells lead to more elements, longer dead time, more complex encoder logic, and higher clock frequency. In turn, more elements occupy more logic resources, and a higher clock frequency increases the risk to timing integrity. Working under a 100 MHz clock, two channels of vernier TDCs using the AND3 and MUX2 macros are expected to occupy about 25 percent of the A3PE1500 logic resources. 


5 Conclusion 


A vernier delay line TDC based on the time interpolating method and implemented in an Actel ProASIC3E FPGA is reported. A “double delay lines” method was employed to cut the dead time in half and improve the time measurement performance, and a temperature drift compensation algorithm was employed to correct for temperature variation. The TDC achieves a dead time of 100 ns, a dynamic range of 655.36 μs, a resolution of 16.4 ps RMS and an averaged bin size of 42 ps over a temperature range of -5°C to +55°C. 